ABSTRACT

Escherichia coli Exonuclease I (ExoI) digests single-stranded
DNA (ssDNA) in the 3'-5' direction in a
highly processive manner. The crystal structure of
ExoI, determined previously in the absence of DNA,
revealed a C-shaped molecule with three domains 
that form a central positively charged groove. The 
active site is at the bottom of the groove, while an 
extended loop, proposed to encircle the DNA, 
crosses over the groove. Here, we present crystal 
structures of ExoI in complex with four different
ssDNA substrates. The structures all have the 
ssDNA bound in essentially the predicted manner, 
with the 3'-end in the active site and the down- 
stream end under the crossover loop. The central 
nucleotides of the DNA form a prominent bulge 
that contacts the SH3-like domain, while the nucleo- 
tides at the downstream end of the DNA form exten- 
sive interactions with an 'anchor' site. Seven of the 
complexes are similar to one another, but one has 
the ssDNA bound in a distinct conformation. The 
highest-resolution structure, determined at 1.95 Å,
reveals an Mg²⁺ ion bound to the scissile phosphate
in a position corresponding to MgB in related
two-metal nucleases. The structures provide new
insights into the mechanism of processive digestion 
that will be discussed. 

INTRODUCTION 

Enzymes that processively digest single-stranded DNA 
(ssDNA) and RNA play important roles in genome main- 
tenance and gene regulation (1,2). While much is known 
about the chemistry of the nuclease reaction (3), it is less 
clear how processive nucleases are able to use the energy 
from phosphodiester bond cleavage to motor along their
track as they digest it. Exonuclease I from Escherichia
coli (ExoI; Mr 54.5 kDa; 475 amino acids) is an
Mg²⁺-dependent enzyme that rapidly digests ssDNA in the 3'-5'
direction to yield 5'-mononucleotides (4). It is involved in 
a number of different processes related to DNA repair and 
recombination, most notably methyl-directed mismatch 
repair, where it is one of three enzymes that can digest 
the 3'-ended strand containing the mismatched nucleotide 
(5). The enzyme digests ssDNA at a rate of ~275 nucleo- 
tides per second with a high degree of processivity (6,7). 
ExoI forms a protein-protein interaction with E. coli
ssDNA-binding protein (SSB), which stimulates its
activity by loading it onto ssDNA substrates and unwinding
secondary structures (8,9). The potency and
processivity of ExoI, combined with an unusual resistance
to high salt, have made it attractive for use in new
biotechnology applications such as nanopore DNA
sequencing (10,11).

The crystal structure of ExoI (12,13) revealed a
C-shaped molecule of three domains: an N-terminal 
nuclease domain (residues 1-201) with homology to the 
proofreading domain of E. coli DNA polymerase I and 
other DnaQ superfamily enzymes (14), a central domain 
with a portion that resembles an SH3 domain fold 
(residues 202-354) and a C-terminal α-helical domain
(residues 359-475). Running down the center of the 
molecule is a positively charged groove with the nuclease 
active site at the bottom end and a loop connecting the 
SH3-like and C-terminal domains crossing over the top. 
The groove is long enough to accommodate ~12-13
nucleotides of ssDNA, which is consistent with footprint- 
ing data based on quantifying the distribution of released 
products (6,15). Based on this structure, a model for the 
ExoI-ssDNA complex, and a mechanism for processivity 
involving topological threading of the ssDNA under the 
crossover loop, was proposed (12). 

We have determined crystal structures of ExoI in
complex with four different ssDNA substrates at resolutions
ranging from 1.95 to 3.5 Å. Each structure has
two ExoI-ssDNA complexes in the asymmetric unit,
giving a total of eight independently determined 
complexes. In all of the complexes, the ssDNA binds in 
essentially the predicted manner, with the 3'-end in the 
active site and the downstream end under the crossover 
loop. While seven of the complexes have the ssDNA 
bound in a similar conformation, one of the complexes 
reveals a distinct binding mode. The possible relevance 
of these structures to the mechanism of processivity is 
discussed. 

MATERIALS AND METHODS 

Protein expression and purification 

The gene encoding ExoI was amplified from E. coli
genomic DNA and cloned into pET14b using the NdeI
and BamHI restriction sites. This vector expresses the
protein with an N-terminal 6xHis tag followed by a
thrombin cleavage site. The resulting plasmid was transformed
into E. coli BL21(AI) cells (Invitrogen), which were grown in
LB medium to an OD of 0.5, induced by addition of 0.2%
arabinose with shaking at 37°C for 4 h, and harvested by
centrifugation for 10 min at 10 000 g. Cells were resuspended
in Buffer A (50 mM NaH2PO4, 300 mM NaCl,
10 mM imidazole, pH 8.0) and frozen at -80°C. Frozen
cells were thawed to 4°C, incubated with 1 mg/ml
lysozyme, 1 mM PMSF, 1 mM pepstatin and 1 mM
leupeptin, lysed by sonication (3 x 2 min at full power
using a Branson Sonifier 450) and centrifuged twice for
30 min at 48 000 g. The clarified supernatant was loaded
onto a 10 ml column of nickel-charged chelating-sepharose
fast flow (GE Healthcare), washed with Buffer
A containing 30 mM imidazole and eluted in fractions
with a gradient from 30 to 500 mM imidazole in Buffer
A. Pooled fractions were incubated with 1000 U of
thrombin (GE Healthcare) and dialyzed at 22°C into
150 mM NaCl, 133 mM NaH2PO4, pH 7.4. The cleaved
mixture was loaded back onto the nickel column, and
flow-through fractions containing untagged ExoI were dialyzed
into 20 mM Tris, pH 8.0, loaded onto a HiTrap QHP
column (GE Healthcare) and eluted with a gradient
from 0 to 1 M NaCl. Pooled fractions containing ExoI
were dialyzed into 20 mM Tris, 1 mM dithiothreitol
(DTT), pH 8.0, concentrated to 26 mg/ml using a
VivaSpin 20 10K MWCO PES filtration device and
stored in small aliquots at -80°C. The purified protein
contains an extra N-terminal Gly-Ser-His (GSH)
sequence from the expression vector. Protein concentrations
were measured by OD at 280 nm using an extinction
coefficient calculated from the amino acid sequence. The
H181A mutant of ExoI was constructed using the
QuikChange method (Stratagene), and the protein was
expressed and purified as described above for wild-type
ExoI.
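
The concentration measurement at 280 nm can be sketched as follows. This is a minimal illustration, not the authors' exact procedure: it assumes the widely used per-residue coefficients of Pace and colleagues (5500 per Trp, 1490 per Tyr and 125 per cystine, in M^-1 cm^-1), and the residue counts in the usage note are hypothetical placeholders rather than the actual composition of ExoI.

```python
def extinction_coefficient_280(n_trp, n_tyr, n_cystine=0):
    """Estimate the molar extinction coefficient at 280 nm (M^-1 cm^-1).

    Assumed per-residue values: Trp 5500, Tyr 1490, cystine 125.
    """
    return 5500 * n_trp + 1490 * n_tyr + 125 * n_cystine


def concentration_mg_per_ml(a280, epsilon, molar_mass_da, path_cm=1.0, dilution=1.0):
    """Beer-Lambert law: c = A / (epsilon * l), converted to mg/ml.

    mol/L multiplied by g/mol gives g/L, which is numerically equal to mg/ml.
    """
    molar = a280 * dilution / (epsilon * path_cm)
    return molar * molar_mass_da
```

For example, `concentration_mg_per_ml(0.5, extinction_coefficient_280(10, 20), 54500)` would convert a reading of 0.5 absorbance units for a hypothetical 54.5 kDa protein with 10 Trp and 20 Tyr residues.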

Exonuclease reactions 

Purified ExoI was incubated at 37°C with a 5'-fluorescein
(FAM)-labeled 48-mer oligonucleotide that was
purchased high performance liquid chromatography
(HPLC)-purified from Integrated DNA Technologies
(IDT), and the digestion products were monitored by
denaturing polyacrylamide gel electrophoresis. Reactions
(25 µl) contained 0.5 µM ExoI (WT or H181A), 1 µM
5'-FAM-oligo, 10 mM HEPES, pH 8.5, 1 mM DTT, 0-2 mM
MgCl2 or CaCl2 and 0-10 mM EDTA as indicated.
Aliquots of 10 µl were removed from each reaction at
the indicated timepoints, quenched by addition of
EDTA to 65 mM final concentration and loaded onto a
22% polyacrylamide gel containing 7 M urea in Tris/
borate/EDTA (TBE) buffer. Bands were visualized using
a UV transilluminator.
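
The reaction stoichiometry and the time scale implied by the published digestion rate can be worked out with a few lines of arithmetic. The sketch below uses only the quantities stated in the text (25 µl reactions, 0.5 µM enzyme, 1 µM substrate, ~275 nucleotides per second); treating digestion as fully processive with no binding lag is a simplifying assumption.

```python
def picomoles(conc_um, volume_ul):
    """Amount in pmol: micromolar times microlitres gives picomoles."""
    return conc_um * volume_ul


def digestion_time_s(n_cleavages, rate_nt_per_s=275.0):
    """Idealized time to perform n cleavage events at a constant rate.

    Assumes uninterrupted, fully processive digestion (no binding or
    dissociation steps), so it is a lower bound on the real time.
    """
    return n_cleavages / rate_nt_per_s


enzyme_pmol = picomoles(0.5, 25)      # ExoI per reaction
substrate_pmol = picomoles(1.0, 25)   # 5'-FAM 48-mer per reaction
substrates_per_enzyme = substrate_pmol / enzyme_pmol
```

Under these assumptions each enzyme molecule faces two substrate molecules, and the 47 phosphodiester bonds of a 48-mer would be cleaved in well under a second at the wild-type rate, which is why residual activity is easiest to see on a gel when the enzyme is strongly inhibited.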

Crystallization and x-ray structure determination 

Oligonucleotides used for crystallization (5'-Cy5-dT13,
5'-Cy5-dA13, dA16 and dT13) were purchased from
IDT. 5'-Cy5 labeled oligonucleotides were purified by
reversed phase HPLC (IDT). For crystallization by
hanging drop vapor diffusion, 10 mg/ml ExoI in 20 mM
Tris, 1 mM DTT, 10 mM EDTA, pH 8.0, was mixed with
a 1.2 molar excess of oligonucleotide. The reservoir
solution consisted of 0.9-1.5 M ammonium sulfate,
3.75-6.0% 2-propanol and 25% glycerol, and the
hanging drop was prepared by mixing 2 µl of ExoI-ssDNA
complex with 2 µl of reservoir solution. Crystals
typically grew within 1 week. For x-ray data collection, a
single crystal was mounted in a nylon loop (Hampton
Research) and plunged in liquid nitrogen. For crystals of
ExoI in complex with Cy5-dT13, x-ray diffraction data
were collected using a Rigaku RU-H3R rotating anode
generator and R-AXIS IV++ image plate detector, and
integrated and scaled with CrystalClear (d*Trek)
software (Molecular Structure Corporation). For crystals
of ExoI in complex with Cy5-dA13, dA16 and dT13, x-ray
diffraction data were collected at beamline 31-ID of the
Advanced Photon Source (λ = 0.97931 Å). Synchrotron
data were integrated and scaled using MOSFLM and
SCALA of the CCP4 suite (16). All data were further
processed using TRUNCATE. Crystals with Cy5-dT13,
Cy5-dA13 and dA16 belong to space group P4(3) with
two molecules per asymmetric unit, and the diffraction
data were twinned (twin operator -h, k, -l; twin
fraction 0.20-0.33). Crystals with dT13 belong to space
group P3(1)21 with two molecules per asymmetric unit
with no twinning. All structures were determined by
molecular replacement using the Auto Mol Rep function of
CCP4 and the structure of uncomplexed ExoI as a search
model (PDB code 1FXX). Structures were further refined
using REFMAC5, and the models were fit using COOT
(17). Refinement of the dA16, Cy5-dA13 and dT13 structures
at resolutions of 3.0-3.5 Å included
non-crystallographic symmetry (NCS) restraints. Amplitude-based
twin refinement in REFMAC5 was used for the
Cy5-dT13, Cy5-dA13 and dA16 structures. Diffraction
data and refinement statistics are presented in Table 1.
Solvent accessible surface area calculations were performed
in CCP4 with a probe radius of 1.4 Å. Structural
figures were prepared using PyMOL (The PyMOL
Molecular Graphics System, Schrödinger, LLC) and
COOT (17). The structure factors and coordinates for all
four crystal structures have been deposited in the Protein
Data Bank. The PDB accession numbers are 4JRP (Cy5-dT13),
4JRQ (Cy5-dA13), 4JS4 (dA16) and 4JS5 (dT13).

Table 1. Data collection and refinement statistics
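
To make the meaning of the twin operator and twin fraction concrete, the sketch below shows the standard two-domain model for merohedral twinning, in which each observed intensity is a weighted sum of a reflection and its twin mate. It is an illustration only: the structures here were refined against the twinned data with the twin refinement in REFMAC5, not by explicit detwinning, and the function names are hypothetical.

```python
def twin_mate(hkl):
    """Apply the twin operator (-h, k, -l) to a Miller index."""
    h, k, l = hkl
    return (-h, k, -l)


def twinned_pair(i1, i2, alpha):
    """Observed intensities for a reflection and its twin mate.

    alpha is the twin fraction (volume fraction of the minor domain).
    """
    return ((1 - alpha) * i1 + alpha * i2,
            alpha * i1 + (1 - alpha) * i2)


def detwin_pair(j1, j2, alpha):
    """Invert twinned_pair; ill-conditioned as alpha approaches 0.5."""
    if not 0 <= alpha < 0.5:
        raise ValueError("twin fraction must be in [0, 0.5)")
    d = 1 - 2 * alpha
    return (((1 - alpha) * j1 - alpha * j2) / d,
            ((1 - alpha) * j2 - alpha * j1) / d)
```

With twin fractions of 0.20-0.33 as reported, the denominator 1 - 2α lies between roughly 0.6 and 0.34, so explicit detwinning would amplify measurement errors, which is one reason refinement against the twinned data is generally preferred.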

Analytical ultracentrifugation 

Sedimentation velocity experiments were performed using 
a Beckman XL-I analytical ultracentrifuge. Purified ExoI
protein at 0.5 mg/ml, alone or mixed with an equimolar
amount of a dT16 oligonucleotide, was diluted to 400 µl in
20 mM Tris, pH 8.0, 150 mM NaCl, and spun at 20°C for
4 h at 50 000 rpm. Data from interference optics were
analyzed using the c(s) and c(M) model in the program 
Sedfit (18) to determine differential sedimentation 
coefficients. 

RESULTS 

Crystallization strategy and structural overview 

We will first describe the structure of ExoI in complex with a
5'-Cy5 labeled dT13 oligonucleotide, as it was determined
to significantly higher resolution (1.95 Å) than the others
(Table 1). Initial attempts to crystallize ExoI in complex
with various oligonucleotides used Ca²⁺ or an H181A
variant to prevent nuclease activity, but this resulted in
crystals that did not contain DNA, based on inspection of
electron density maps. Using a gel-based assay, residual
nuclease activity was observed under these conditions
(Supplementary Figure S1), indicating that the ssDNA
added to the crystallization mixture was likely being
digested. However, addition of 10 mM EDTA with no
added metal completely inhibited ExoI activity, and so
crystallization screens were performed in the presence of
10 mM EDTA. The 5'-Cy5 labeled dT13 oligonucleotide
was used so that crystals containing bound ssDNA would
appear blue, and thus be readily distinguished from crystals
without DNA. Crystallization screens yielded a single blue
hit for a tetragonal crystal form grown in high salt that
diffracted to 1.95 Å on a rotating anode X-ray source.
Molecular replacement resulted in solutions for two
monomers of ExoI in the asymmetric unit, and after
initial refinement electron density maps allowed for placement
of all 13 nucleotides of each Cy5-dT13 oligo, including
the Cy5 groups (Figure 1).

Figure 1. Electron density for the DNA in the Cy5-dT13 complex. The
blue cage shows an Fo-Fc map calculated with model phases after
40 cycles of restrained refinement in REFMAC5 with the ssDNA
from complex A (left) or complex B (right) omitted. Each map is
contoured at +3.0 sigma and shown over the entire complex. The Cy5
groups at the 5'-end of each DNA are evident at the top of the
figure. The 3'-end of each DNA binds to the active site at the bottom.

The two ExoI-dT13 complexes in the asymmetric unit
form a dimer with ~2-fold symmetry that buries 1140 Å²
of total solvent accessible surface area through contacts
involving primarily the SH3-like domains (Supplementary
Figure S2). The dimer is also stabilized by interactions
involving the nucleotides at the 5'-end of each bound
ssDNA, including the 5'-Cy5 groups. As described
below, the same dimer is formed in crystals with
non-labeled oligos, indicating that the Cy5 groups are not
required for dimerization. Moreover, the complex with
unlabeled dT13 oligo, which crystallized in a different
space group (P3(1)21 instead of P4(3)), contained the same
dimer, suggesting that the dimer is ubiquitous. Initial
reports on ExoI purification indicated the possibility of
a dimer (19), but gel filtration of more highly purified
ExoI protein indicated that it is a monomer in solution
(20). A more recent observation that the activity of ExoI
at higher concentrations on a particular SSB-ssDNA substrate
was cooperative, again suggested the possibility of
ExoI multimers, perhaps induced by DNA binding (9). To
test if DNA binding induces ExoI dimerization directly,
sedimentation velocity was performed on ExoI alone and
in the presence of a dT16 oligonucleotide. The results
clearly indicate a monomer in both cases, and hence no
change in oligomeric state on DNA binding
(Supplementary Figure S3). Moreover, the amino acid
residues at the interface of the dimer observed in the
crystal are not conserved in ExoI orthologs
(Supplementary Figure S4). Based on the available
evidence, we consider it unlikely that the ExoI dimer
observed in the crystal is biologically relevant.
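
The buried surface area quoted for the crystallographic dimer follows from a simple difference of solvent accessible surface areas (SASA). The sketch below shows the bookkeeping only; the SASA values in the usage note are hypothetical numbers chosen to reproduce the reported total, since the individual areas are not given in the text.

```python
def buried_surface_area(sasa_a, sasa_b, sasa_complex):
    """Total area buried on complex formation, summed over both partners.

    All inputs are solvent accessible surface areas in square angstroms,
    computed with the same probe radius (1.4 angstrom in this work).
    """
    return sasa_a + sasa_b - sasa_complex
```

For instance, `buried_surface_area(20000, 20500, 39360)` returns 1140 for these made-up inputs. Note that some programs report the area buried per monomer, which would be half of this total.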

Rather remarkably, the two complexes of ExoI in the
Cy5-dT13 crystal, which we will refer to as complexes A 
and B, have the ssDNA bound in two distinct conform- 
ations (Figure 2). Both complexes have the ssDNA 
threaded under the crossover loop, but in complex A, all 
13 nucleotides of the ssDNA have entered the DNA-binding
groove, whereas in complex B, only the first 10
nucleotides, counting from the 3'-end, have entered the
groove. At the bottom of the groove, the terminal 3'-nu- 
cleotide is bound to approximately the same position 
within the active site of each complex. At the top of the 
groove, three nucleotides in each complex are bound by an 
extensive and equivalent set of interactions to a site on the 
protein that we will call the 'anchor' site. However, due to 
the difference in registration, in complex A, it is nucleotides 
11-13 of the ssDNA that are bound to the anchor site, 
whereas in complex B, it is nucleotides 8-10. The region 
of the ssDNA between the active site and the anchor site, 
which we will call the 'bulge', is larger for complex A than 
for complex B (nucleotides 6-10 versus 5-7), but both 
bulge regions make a similar contact to the SH3-like 
domain at the left side of the DNA-binding groove, as 
viewed in Figure 2. The interactions at the anchor site, 
the active site and the bulge regions of complexes A and 
B, will be described in more detail below. 
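
The difference in registration between the two complexes can be summarized as a small lookup table. The sketch below simply encodes the nucleotide ranges stated above (numbered from the 3'-end) and derives the register shift and bulge lengths from them; the dictionary layout and function names are illustrative choices, not part of the structural analysis.

```python
# Nucleotide ranges as given in the text, numbered from the 3'-end.
# Python ranges exclude the upper bound, so range(6, 11) is 6..10.
COMPLEXES = {
    "A": {"bulge": range(6, 11), "anchor": range(11, 14), "in_groove": 13},
    "B": {"bulge": range(5, 8), "anchor": range(8, 11), "in_groove": 10},
}


def register_shift(first, second):
    """Offset (in nucleotides) between the anchor-site registers."""
    return first["anchor"].start - second["anchor"].start


def bulge_length(complex_):
    """Number of nucleotides in the bulge region."""
    return len(complex_["bulge"])


shift = register_shift(COMPLEXES["A"], COMPLEXES["B"])
```

The shift of three nucleotides at the anchor site is only partly absorbed by the bulge, which is two nucleotides shorter in complex B (three versus five).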

Figure 2. Two distinct complexes of ExoI bound to Cy5-dT13.
(A) Structural overview. Complexes referred to as 'A' and 'B' in the
text are shown in each panel on the left and right, respectively. The
Cy5-dT13 oligo is shown in red with the 5'-Cy5 omitted. ExoI is
colored by domain: exonuclease domain, yellow (residues 1-201);
SH3-like domain, green (residues 202-354); C-terminal domain, blue
(residues 355-475). The crossover loop shown in complex B is not
present in the model for complex A due to weak electron density.
(B-D) Close-up views of the interactions at the anchor site, bulge
and active site regions of each complex, respectively. Dashed lines
indicate electrostatic (<4.0 Å) and hydrogen bonding (<3.5 Å)
interactions between the protein and DNA. Nucleotides of the Cy5-dT13
oligo are numbered T1-T13 starting from the 3'-end.

Based on a structural superposition (Figure 3 and
Supplementary Table S1), complexes A and B differ
mainly in the conformation of the DNA, as opposed to
the protein. The protein portions of the two complexes
superimpose to an rmsd of 1.3 Å for all Cα atoms. The
structural differences do not involve large-scale domain
movements, but instead are primarily confined to a
segment of the SH3-like domain (residues 268-297) that
moves inward by ~4 Å in complex B to maintain its contact
with the shorter bulge region (Figure 3B). There are also no
large-scale domain movements that occur in ExoI on DNA
binding. Complexes A and B align to previously reported
structures of unbound ExoI to rmsd values ranging from
0.9 to 1.5 Å, and 1.4 to 1.8 Å, respectively, for all Cα atoms
(Supplementary Table S1). Again, the most significant
differences between DNA-bound and unbound ExoI structures
occur in the segment of the SH3-like domain that
contacts the bulge region of the DNA. In particular,
residues 277-295 of this segment adopt a particularly
closed conformation in the original ExoI structure
(1FXX; 12), which would clash with the bulge region of
the ssDNA in both complexes. However, in other crystal
structures of uncomplexed ExoI, this segment of the
SH3-like domain is disordered (13,21), suggesting that it is in
general a highly flexible region of the structure, as
opposed to one that adopts distinct open and closed
states in the presence and absence of DNA, respectively.

Figure 3. Overlay of complexes A and B observed in the Cy5-dT13
structure. (A) Complexes A (cyan) and B (magenta) were superimposed
using all protein Cα atoms (rmsd 1.3 Å). Notice that all 13 nucleotides
of the ssDNA are bound within the ExoI groove in complex A, whereas
in complex B, only the first 10 nucleotides are bound to the groove,
such that the three 5' nucleotides extend from the surface of the
complex. The 3'-nucleotide in the active site and the three nucleotides
in the anchor site of each complex overlap closely, whereas those in the
bulge region differ significantly. (B) Close-up view of the interaction
with the bulge region. A portion of the SH3-like domain of complex B
rotates inward by ~4 Å to maintain its contact with the shorter bulge
region. The superposition is the same as in panel A, but zoomed in on
the region that shows the largest difference in the conformation of the
protein.
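
The rmsd values quoted for the Cα superpositions are minimum values after optimal rigid-body fitting. As a hedged illustration of how such a number can be obtained, the sketch below uses Horn's quaternion formulation, in which the minimum rmsd follows from the largest eigenvalue of a 4x4 matrix built from the coordinate cross-covariances. It is not the program used in this work; the eigenvalue is found by a simple shifted power iteration, which is adequate for a sketch but converges slowly when the top eigenvalues are nearly equal.

```python
import math


def _centered(coords):
    """Translate a list of (x, y, z) points so the centroid is at the origin."""
    n = len(coords)
    c = [sum(p[i] for p in coords) / n for i in range(3)]
    return [[p[i] - c[i] for i in range(3)] for p in coords]


def _largest_eigenvalue(m, iterations=2000):
    """Largest eigenvalue of a symmetric 4x4 matrix by shifted power iteration.

    The Gershgorin shift makes the matrix positive semi-definite; starting
    from each basis vector guarantees overlap with the top eigenvector.
    """
    shift = max(sum(abs(x) for x in row) for row in m)
    best = -shift
    for start in range(4):
        v = [1.0 if i == start else 0.0 for i in range(4)]
        for _ in range(iterations):
            w = [sum(m[i][j] * v[j] for j in range(4)) + shift * v[i]
                 for i in range(4)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        rayleigh = sum(v[i] * m[i][j] * v[j]
                       for i in range(4) for j in range(4))
        best = max(best, rayleigh)
    return best


def superposed_rmsd(mobile, reference):
    """Minimum rmsd between two equally sized, paired coordinate sets."""
    if len(mobile) != len(reference) or not mobile:
        raise ValueError("coordinate sets must be non-empty and equal in size")
    p, q = _centered(mobile), _centered(reference)
    s = [[sum(a[i] * b[j] for a, b in zip(p, q)) for j in range(3)]
         for i in range(3)]
    (sxx, sxy, sxz), (syx, syy, syz), (szx, szy, szz) = s
    n_mat = [
        [sxx + syy + szz, syz - szy, szx - sxz, sxy - syx],
        [syz - szy, sxx - syy - szz, sxy + syx, szx + sxz],
        [szx - sxz, sxy + syx, -sxx + syy - szz, syz + szy],
        [sxy - syx, szx + sxz, syz + szy, -sxx - syy + szz],
    ]
    lam = _largest_eigenvalue(n_mat)
    e0 = sum(x * x for a in p for x in a) + sum(x * x for b in q for x in b)
    return math.sqrt(max(0.0, (e0 - 2.0 * lam) / len(p)))
```

In practice the two inputs would be the Cα coordinates of equivalent residues from the two models, in the same order; residues missing from either model (such as the untraced part of the crossover loop in complex A) must be excluded from both lists before the calculation.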

Anchor site 

The interactions at the anchor site, which involve all
three domains of the protein and three consecutive nucleotides
of the ssDNA, are essentially the same for
both complexes, despite the difference in registration
(Figure 2B; a schematic view of ExoI-DNA interactions
is provided in Supplementary Figure S5). The three
nucleotides at the anchor site are bound with the phosphates
buried and the bases predominantly exposed, with their
Watson-Crick edges sticking out into solution.
A prominent interaction is the insertion of the phenyl
ring of Phe371 between the bases of nucleotides 12 and
13 of complex A, or 9 and 10 of complex B, counting from
the 3'-end. This interaction appears to set an integral
registry of the ssDNA relative to the protein. The
aromatic rings of Trp128, Tyr124 and Tyr368 line the
right side of the groove to form a hydrophobic wall that
contacts the ribose groups of the three nucleotides.
A number of polar residues, including Arg113, Arg134,
Lys214, Asn257 and Asn304, line the bottom of the
groove to form hydrogen bonds and ion pairs with the
phosphates, particularly those of nucleotides 11 and 12
of complex A, or 8 and 9 of complex B. Most of the
residues that contact the ssDNA at the anchor site are
invariant or highly conserved in distant ExoI orthologs
(Supplementary Figure S4), suggesting that these
interactions are biologically relevant.
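
The hydrogen bonds and ion pairs described here are assigned by distance, using the cutoffs given in the Figure 2 legend (<3.5 Å for hydrogen bonds, <4.0 Å for electrostatic interactions). A minimal sketch of such a classification is shown below. The rule that the longer cutoff applies only to oppositely charged groups, and the `charged_pair` flag itself, are assumptions made for illustration; real analyses also check donor and acceptor chemistry and geometry.

```python
import math

HBOND_CUTOFF = 3.5      # angstroms, from the Figure 2 legend
ION_PAIR_CUTOFF = 4.0   # angstroms, from the Figure 2 legend


def classify_contact(atom_a, atom_b, charged_pair=False):
    """Classify a protein-DNA atom pair by distance alone.

    atom_a and atom_b are (x, y, z) coordinates in angstroms.
    charged_pair marks pairs such as an Arg/Lys nitrogen and a
    phosphate oxygen (assumed to be supplied by the caller).
    """
    d = math.dist(atom_a, atom_b)
    if charged_pair and d < ION_PAIR_CUTOFF:
        return "electrostatic"
    if d < HBOND_CUTOFF:
        return "hydrogen bond"
    return None
```

A pair of atoms 3.8 Å apart would thus be drawn as a dashed line only if it involves a charged side chain and a phosphate, whereas a 2.7 Å contact qualifies as a hydrogen bond regardless of charge.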

Active site 

At the bottom end of each complex, the 3'-end of the
ssDNA binds to the active site of the exonuclease
domain (Figure 2D). The terminal 3'-nucleotide of the
ssDNA overlaps closely with the dTMP bound in a
previous structure of ExoI, which was formed as a digestion
product (13). The terminal 3'-OH forms a close
hydrogen bond (2.7 Å) with the backbone amide of
Thr18. This explains why ExoI does not cleave
3'-phosphorylated ssDNA (4), as a 3'-phosphate would
clash with the backbone of Thr18 to push the scissile
bond out of the active site. The base of the terminal
nucleotide binds to a hydrophobic pocket formed by the side
chains of Thr21, Ala63 and Ile66. In complex A, the
scissile phosphate of the terminal nucleotide lies near the
carboxylate groups of Asp15, Glu17 and Asp186, which
are the conserved metal-binding residues. Despite the
presence of 10 mM EDTA in the crystallization mixture,
clear electron density is observed for an active site metal
ion, which lies at the position of MgB in related two-metal
nucleases (Figure 4) (22,23). This atom is modeled as an
Mg²⁺ ion, based on its distinct octahedral coordination to
the scissile phosphate (O1P and O3' atoms), the carboxylate
group of Asp15 and three water molecules. No
electron density is observed at the expected position for
MgA, which is consistent with the fact that the bound
DNA is not cleaved. The scissile phosphate in complex
A is also near the imidazole side chain of His181, which
is thought to help align and activate the attacking water
molecule that would be bound to MgA. The interactions
observed in complex A thus appear close to what would be
expected for a catalytic complex, except for the absence of
MgA and the attacking water. This conclusion is further
supported by the fact that the first three nucleotides
and MgB align well with the trinucleotide bound to the
proofreading domain of E. coli DNA polymerase I (Figure
4B) (24). By contrast, in complex B, although the 3'-OH is
bound in the same position as in complex A, there are no
bound metal ions, and the scissile phosphate is rotated
upward, such that it is more distant (>5.4 Å) from
the active site carboxylates (Figure 4C). Complex B thus
appears considerably less poised for cleavage than
complex A.

Figure 4. Electron density for an Mg²⁺ ion in the active site of complex A.
(A) Stereo view of the 2Fobs-Fcalc electron density map for the active site of
complex A. The Mg²⁺ ion (magenta) is coordinated with near octahedral
geometry to the scissile phosphate (O1P and O3' atoms), the carboxylate of
Asp15 and three water molecules. This is the expected position for MgB in
related two-metal enzymes. No electron density is present at the expected
position for MgA. (B) Superposition of the active site of complex A with the
structure of E. coli DNA polymerase I in complex with a trinucleotide
(PDB code 1KFS). For ExoI, the protein is in yellow, DNA in red and Mg²⁺
magenta. For Pol I, the protein, DNA and metal ions (Mg²⁺ and Zn²⁺)
are shown in cyan. Notice that the first two nucleotides of each complex,
including the scissile phosphates, overlap closely, indicating that ExoI
complex A is in a near catalytic conformation, despite the absence of MgA.
(C) Overlay of the active sites of complex A (yellow and red) and
complex B (green). The two bound ssDNA molecules overlap at the 3'-OH
but otherwise diverge.

Moving along the ssDNA from the 3'-end in the active
site, the nucleotides in each of the two complexes rapidly
diverge from one another. In complex A, the first five
nucleotides are in an approximate B-form conformation,
with the bases predominantly exposed and stacked over
one another, and the phosphates forming hydrogen bonds
and electrostatic interactions with three backbone amides
(Leu166, Phe164, Ala238) and two positively charged side
chains (Arg142 and Arg165). By contrast, nucleotides 2-5
of complex B follow a more irregular path, with minimal
base stacking and fewer interactions with the protein
(Figure 2D). Interestingly, the phosphates of nucleotides
3, 5 and 6 in complex B are pinched together to within
about 4 Å of one another in the center of the bulge
(Figure 2C). Such an interaction, which is clearly
observed in the structure owing to the strong density of
the phosphates, would be energetically unfavorable, owing
to electrostatic repulsions of the phosphates. If complex B
is biologically relevant, crowding of these phosphates
could conceivably create a tension that could be coupled
in some way to translocation of ExoI along the ssDNA to
initiate further rounds of cleavage.

Nucleic Acids Research, 2013, Vol. 41, No. 11 5893

Bulge region 

The next three nucleotides of complex A, nucleotides 6-8, 
make a turn at the tip of the bulge, forming a close inter- 
action with a loop formed by residues 284-287 of the SH3- 
like domain (Figure 2C). A prominent interaction involves 
Tyr284, which wedges between the bases of nucleotides 7 
and 8. A similar interaction between the SH3-like domain
and the tip of the bulge occurs in complex B, where 
Tyr284 inserts between the bases of nucleotides 5 and 6. 
In complex B, this region of the SH3-like domain moves 
inward by ~4 Å to maintain its contact with the shorter
bulge region (Figure 3B). Due to the inherent flexibility, 
the electron density for this region of the structure, both 
for the protein and for the ssDNA, is weaker than for 
other parts of the structure, as reflected by higher tem- 
perature factors. Tyr284 appears to be conserved as Tyr 
or Phe in ExoI orthologs, although the length of this loop
is variable, making this region difficult to align 
(Supplementary Figure S4). 

The ssDNA in both complexes is threaded under a 
stretched out 'crossover' loop that connects the SH3-like
domain on one side with the helical domain on the other 
(Figure 2C). In complex A, the electron density did not 
allow for tracing of residues 355-359 of this loop, while in 
complex B, the electron density was slightly clearer and
the entire loop is included in the model, albeit with high
temperature factors. The nucleotides of the ssDNA that
bind under the crossover loop are in a rather extended con- 
formation in both complexes, such that the bases are 
splayed out, making minimal interactions with the protein 
or with one another. A prominent interaction seen in 
complex B is the insertion of the phenyl ring of Phe357 of 
the crossover loop between the bases of nucleotides 6 and 7. 
A similar interaction could also occur in complex A, in this 
case with Phe357 inserting between bases 8 and 9, but weak 
electron density for this region of complex A did not permit 
tracing of the crossover loop. Moreover, Phe357 is not well 
conserved in ExoI orthologs (Supplementary Figure S4),
suggesting that this interaction is not critical for function. 

Crystal structures of ExoI with additional ssDNA
substrates 

To determine the extent to which the two complexes
observed in the Cy5-dT13 structure described above are
prevalent, crystal structures of ExoI in complex with
three additional ssDNA substrates were determined.
Complexes with unlabeled dA16 and Cy5-dA13 crystallized
in the same space group as Cy5-dT13 (P4(3)), while the
complex with unlabeled dT13 crystallized in space group
P3(1)21, although it still has the same dimer of ExoI present
in the asymmetric unit. The three new structures were
determined to lower resolution, 3.0-3.5 Å, and thus do
not show the detailed protein-DNA interactions as well
as in the Cy5-dT13 structure. Despite the lower resolution,
electron density maps allowed for placement of all of the
nucleotides of each ssDNA, except for the Cy5 groups of
Cy5-dA13, the two 5'-terminal nucleotides of dA16 and two
nucleotides in the central bulge of dT13 (Figure 5). The
most significant finding from the three additional structures
is that all six of the new ExoI-ssDNA complexes (two
complexes for each crystal structure) have the ssDNA
bound in a conformation that is close to that of complex
A described above, suggesting that complex B of the
Cy5-dT13 crystal structure may be an outlier.

Figure 5. Structure of ExoI in complex with additional ssDNA
substrates. (A) dA16, (B) Cy5-dA13, (C) dT13. The figure on the left of
each panel shows an Fo-Fc map calculated with model phases after 40
cycles of restrained refinement in REFMAC5 with the ssDNA omitted.
Each map is contoured at +3.0 sigma and shown over the entire
complex. The figure on the right shows a superposition of each
structure (green and blue bonds) with complex A of the Cy5-dT13 structure
(yellow bonds). Notice that the ssDNA in each new structure is bound
in a similar conformation as the ssDNA in complex A of Cy5-dT13.

Although we report only three additional refined structures,
crystals similar to those with dA16 (which are
likewise similar to Cy5-dA13) could be grown with dA
oligos ranging from 13 to 16 nucleotides in length.
Preliminary analysis of a crystal with dA14 indicates
that the structure is essentially identical to the complex
with dA16. Likewise, crystals similar to those with
unlabeled dT13 could be grown with slightly longer dT
oligos (dT14-dT16), and preliminary analysis of a
crystal with dT16 indicates that the structure is essentially
identical to the dT13 complex. These observations indicate
that the length of the ssDNA does not affect the
observed binding mode: the first ~13 nucleotides of the
ssDNA are bound to the ExoI groove in essentially the
same conformation in all four of the structures (dT13,
dT16, dA14 and dA16), such that the extra 5'-nucleotides
of dT16 and dA16 simply extend out from the surface of
the complex.

DISCUSSION 

A hallmark of ExoI is its high degree of processivity.
While the two-metal mechanism has been structurally
well established for numerous Mg²⁺-dependent nucleases
(3,22,23), what is less clear is the mechanism by which
enzymes such as ExoI are able to translocate along the
ssDNA in a rapid and processive manner. In the
original ExoI structure, encircling of the ssDNA by the
crossover loop was proposed to be a key to processivity
(12). Topological linkage is a recurring theme for other
processive enzymes, such as λ exonuclease, PCNA and
others (25). However, in an early report on processive
nucleases, before any of their structures were known, a
dual binding-site model was proposed (7). This model
invoked the concept of an 'anchor' site on the enzyme,
separate from the active site, that could allow the
enzyme to remain bound to one site on the DNA as it
stepped forward at the other. The crystal structures of
the ExoI-ssDNA complex reported here show both
features: threading of the ssDNA under the crossover
loop and dual binding of the ssDNA to the active site
and the anchor site. The extent to which both of these
features, topological linkage and two-site binding, are
necessary for processivity remains to be determined.

The crystal structures reported here have captured two 
distinct states of Exol bound to ssDNA that could provide 
important clues to understanding the mechanism of trans- 
location. However, it is important to consider whether 
both of the structures are likely to be biologically 
relevant, or whether one of the observed states is a struc- 
tural artifact, due to interactions of the Cy5 groups or 
other crystal packing forces. Complex A is remarkably 
consistent with the biochemical studies of Brody and col- 
leagues, who were able to make detailed predictions about 
the Exol-ssDNA interactions from quantifying the
distribution of products released from digestions of
various oligonucleotide substrates under conditions of
limiting enzyme (6,15). Based on this analysis, it was pre- 
dicted that Exol interacts with two separate sites on the 
ssDNA: an 'extended active site' starting at the 3'-terminal 
nucleotide and extending for about six nucleotides, and an 
'anchor' site located at nucleotides 9-13. Comparison of 
these predictions with the interactions observed in 
complex A, as shown schematically in Supplementary 
Figure S5, reveals a remarkably close agreement. It was 
further predicted, based on analysis of oligonucleotide
substrates containing abasic residues or methylpho- 
sphonates at specific positions, that while the interactions 
at the extended-active site require both the bases and the 
phosphates, those at the anchor site require only the 
sugar-phosphate backbone. Again, there is a reasonable 
agreement. While there is some contact with the bases at 
the anchor site, most notably the insertion of Phe371 
between the bases of nucleotides 12 and 13, for the most 
part the bases at the anchor site are largely exposed, while 
the phosphates are fully buried, making extensive inter- 
actions with the protein. In addition to being consistent 
with the biochemical studies, complex A has the ssDNA in 
close to a catalytic configuration, despite the presence of 
only one metal ion. This is seen by the fact that the first 
three nucleotides in complex A overlap closely with a 
trinucleotide substrate bound to the proofreading 
domain of E. coli DNA Pol I in the presence of metals
(24). Finally, as shown in the sequence alignment of 
Supplementary Figure S4, most of the key residues of 
Exol that contact the DNA in complex A are predomin- 
antly conserved in distant Exol orthologs. Thus, complex 
A appears to be a functionally relevant state. 

Complex B is overall similar to complex A in that the 
3'-end of the ssDNA is bound to the active site, the down- 
stream end is threaded under the crossover loop and 
similar contacts are made at the bulge region and the 
anchor site. Complex B differs from complex A in that 
only the first 10 nucleotides from the 3'-end have entered 
the binding groove, such that the registry of the protein 
relative to the ssDNA at the anchor site is off by exactly 
three nucleotides. There is consequently a shorter bulge of 
the ssDNA in complex B, and the interactions at the active 
site are not as extensive and are further from what would be
expected for a catalytic complex. It is possible that the Cy5 
label of complex B, which is clearly visualized in the struc- 
ture due to its interactions at the dimer interface, could 
have pulled the ssDNA partially out of the binding groove 
to result in a structural artifact. The fact that the conform- 
ation of complex B was not seen in any of the structures 
with unlabeled oligos suggests that this may be the case. 
On the other hand, the Cy5 group apparently did not 
interfere with complex A, which also has its Cy5 group 
forming close interactions at the dimer interface, or in the 
structures observed with Cy5-dA13, for which the Cy5 
groups are not visible in the electron density. It is also 
interesting to note how a region of the SH3-like domain 
of Exol moves inward by ~4 Å in complex B, to maintain
its contact with the shorter bulge region. It thus appears 
that Exol has a built-in flexibility to accommodate differ- 
ent conformations of bound DNA. 
While the overall conclusions from Brody et al. (6,15)
are in general agreement with complex A, some of their 
observations can perhaps be more easily reconciled with 
complex B. For example, for all of the different oligo- 
nucleotides that were digested, only small amounts of 
11-mers or longer oligonucleotides were released from 
the enzyme (0-6% of the total), indicating that Exol 
binds nearly as tightly to the 11-mer as to longer
oligonucleotides. This would suggest that any interactions of
Exol occurring beyond the 11th nucleotide are not critical
for affinity. However, in complex A the main contacts at 
the anchor site are with nucleotides 11-13, with the phos- 
phate of nucleotide 12 having particularly close inter- 
actions. By contrast, in complex B the shift in registry 
places the equivalent set of interactions at nucleotides
8-10. Significantly higher levels of 10-mers and especially
9-mers are released as end products of Exol digestions,
indicating that interactions at nucleotides 11 and 10 con- 
tribute significantly to binding. In general, the most 
significant effect is loss of nucleotide 10, which results in 
release of high levels of 9-mer (20-50% of the total). 
In complex A, nucleotide 10 does not make close 
contacts to the protein, as there are only long range 
(5.8 Å) electrostatic interactions to the phosphate and a
weak (3.3 Å) hydrogen bond to the base. By comparison,
in complex B, nucleotide 10 is one of the three that 
are right at the anchor site. While not all of the observa- 
tions of Brody et al. can be reconciled with complex B in 
this way, in general the biochemical data tend to indicate 
that complex B may be relevant. One possibility is that 
while complex A represents a catalytic state, complex B 
may be representative of the end state of a processive 
digestion. 

Based on the available structural information, one 
could envision two fundamentally different models for 
how Exol processively digests ssDNA. In one model 
(Figure 6A), in which only the binding mode of complex 
A is relevant, Exol translocates by one nucleotide step 
along the DNA after each round of cleavage, to 
generate a new complex A at each step. Such a model 
would require breakage of several close hydrogen bonds 
and base insertion interactions at the anchor site after 
each individual round of cleavage, but each translocation 
step could conceivably be coupled in some way to the 
energy released from cleavage of the terminal
phosphodiester bond, estimated at -5.3 kcal/mol (26). The
second model (Figure 6B) assumes that both of the 
observed binding modes are relevant. Starting with 
complex A, in which Exol is bound to 13 nucleotides 
from the 3'-end, three rounds of cleavage, without dis- 
placement of the nucleotides bound to the anchor site, 
would generate the structure observed in complex B, in 
which Exol covers 10 nucleotides from the 3'-end. 
Positioning of the scissile bond of the terminal nucleotide 
at each round of cleavage could be accommodated by 
shortening of the bulge region of the ssDNA, together 
with inward movement of the SH3-like domain. In
comparing the structures of complexes A and B, it 
appears that such intermediate states could be easily 
accommodated. We note, however, that the conformation 
of the ssDNA in the active site of complex B is different
from that in complex A, particularly for the third and fourth
nucleotides (Figure 2D).

Figure 6. Cartoon depicting two possible mechanisms for Exol
processive digestion. (A) Single-nucleotide stepping mechanism in
which translocation of Exol along the ssDNA occurs by one nucleotide
step after each individual round of cleavage. (B) Three-nucleotide
stepping mechanism in which translocation of Exol along the ssDNA
occurs by a three-nucleotide step after every three rounds of cleavage.
The first model uses only the binding mode observed in complex A,
whereas the second model invokes the two distinct binding modes
observed in complexes A and B.

We envision that intermediate states
that are active for cleavage would more closely resemble 
complex A and that the conformation of complex B may 
occur when the ssDNA bound in the groove is too short to 
bind as seen in complex A. After three rounds of cleavage, 
the ssDNA would be too short to allow the 3'-end to fully 
reach the active site, and so further rounds of cleavage 
would require release of the nucleotides from the anchor 
site to generate a new complex A in which the enzyme 
has translocated three nucleotides forward along the 
DNA. In this way, Exol would translocate along the 
DNA with a step size of three nucleotides, with three
rounds of cleavage occurring between each translocation 
step. A possible advantage of this type of mechanism 
is that breakage of the interactions made by the three 
nucleotides at the anchor site would only need to occur 
once for every three rounds of cleavage, instead of after 
each individual step. 
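
The bookkeeping behind the two models can be illustrated with a short Python sketch. It is only a toy counter written for this comparison, not an analysis performed in this work: the function name `simulate_stepping`, the `footprint_trace` list and the choice of 12 cleavage rounds are illustrative assumptions, while the footprints of 13 and 10 nucleotides are taken from complexes A and B as described above.

```python
FULL_FOOTPRINT = 13   # nucleotides covered from the 3'-end in complex A (from the text)
SHORT_FOOTPRINT = 10  # nucleotides covered from the 3'-end in complex B (from the text)


def simulate_stepping(n_cleavages, step_size):
    """Toy counter for the two models of Figure 6 (hypothetical helper).

    step_size=1 corresponds to single-nucleotide stepping (Figure 6A),
    step_size=3 to three-nucleotide stepping (Figure 6B).
    Returns (anchor_releases, footprint_trace), where footprint_trace holds
    the number of covered nucleotides right after each round of cleavage.
    """
    covered = FULL_FOOTPRINT
    anchor_releases = 0
    footprint_trace = []
    for _ in range(n_cleavages):
        # One round of cleavage removes the 3'-terminal nucleotide while
        # the anchor site stays bound, so the bulge shortens by one.
        covered -= 1
        footprint_trace.append(covered)
        if FULL_FOOTPRINT - covered == step_size:
            # The anchor contacts break and re-form further along the DNA,
            # restoring the complex A footprint.
            anchor_releases += 1
            covered = FULL_FOOTPRINT
    return anchor_releases, footprint_trace


if __name__ == "__main__":
    for step in (1, 3):
        releases, trace = simulate_stepping(12, step)
        print(f"step size {step}: {releases} anchor releases, "
              f"smallest footprint {min(trace)} nt")
```

Under these assumptions, 12 rounds of cleavage require 12 anchor-release events in the single-nucleotide model but only 4 in the three-nucleotide model, and the smallest footprint reached in the latter is the 10 nucleotides seen in complex B.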

Intriguingly, a similar 'stepping' type of mechanism has 
recently been proposed to occur for Rrp44, a catalytic 
subunit of the yeast exosome complex, which processively 
digests single-stranded or structured RNA in the 3'-5' dir- 
ection (27). Single molecule FRET analysis of Rrp44 
digesting a 3'-tailed 43 bp A-form RNA duplex revealed 
that the enzyme unwinds the duplex not in single base pair 
steps but rather in bursts of ~4 bp. The data further
indicated that unwinding, as opposed to the chemical 
cleavage steps, is rate limiting. It was surmised that as 
the enzyme cleaves one nucleotide at a time from the 
3'-tail, it gets successively closer to the duplex region, 
building up tension within the protein that is elastically 
coupled to duplex unwinding. Exol is fundamentally 
different from Rrp44 in that it digests ssDNA instead of
RNA and has a much lower capacity to unwind regions of 
duplex (4). Nonetheless, it is interesting to speculate that 
the translocation model described for Exol in Figure 6B 
could conceivably generate a similar type of stepping 
pattern as that observed for Rrp44. It is also interesting 
to note that the ExoI-ssDNA complex shares some 
common features with Rrp44 and the related ribonuclease 
II of E. coli, for which structures have been determined in 
complex with 13-mer oligoribonucleotides (28,29). All 
three enzymes have five nucleotides at the 3'-end inserted
into an active-site cleft, a turn or bulge region in which the 
nucleotides are more splayed out, and an 'anchor' region 
at which close interactions are formed with the down- 
stream end of the substrate. In the case of RNase II, a 
single-step translocation mechanism was proposed, as 
only one binding mode of the protein on the substrate 
was observed (29). A single binding mode was also 
observed for Rrp44, although it is interesting to note that
this structure differed significantly from that of RNase II
in the way that the downstream portion of the RNA was
bound (28). 

It is also interesting to compare the two mechanisms 
described above for Exol to that of λ exonuclease, an
enzyme that processively digests one strand of
double-stranded DNA in the 5'-3' direction. λ Exonuclease
forms a trimer with a central funnel-shaped channel 
for tracking along the DNA (30,31). The duplex is 
unwound ahead of the cleavage site by exactly 2 bp, such 
that the 5'-end inserts into one of the three active sites on 
the trimer, and the 3'-end threads through the central 
channel to emerge out the back. Processivity is proposed 
to be due to encircling of the 3'-ended strand as the trimer 
motors along the DNA, and translocation may be driven 
in part by attraction of the 5'-phosphate generated at each 
round of cleavage to a positively charged pocket at the 
end of the active-site cleft (31). Sliding of the trimer
along the DNA, presumably in single base pair steps, 
appears to be facilitated by a rather loose set of electro- 
static interactions with the DNA in the central chan- 
nel, combined with the insertion of a key arginine 
residue into the minor groove of the downstream 
portion of the duplex. Because the binding of the 
arginine to the minor groove involves electrostatic attrac- 
tion to the phosphates (32), as opposed to specific 
hydrogen bonds to the bases, it was proposed that the 
arginine could act as a sliding rudder to keep the
enzyme on track as it translocates along the DNA (31). 
By comparison, the interactions of Exol with the down- 
stream portion of the ssDNA at the anchor site seem 
less well suited to sliding, as they involve a network of 
several close hydrogen bonds, as well as insertion of 
aromatic residues between the bases. In this regard, the 
three-nucleotide stepping mechanism described above 
for Exol is particularly attractive. 